Skip to content

Expand attestation related queues#4058

Closed
paulhauner wants to merge 1 commit intosigp:unstablefrom
paulhauner:expand-attn-queues
Closed

Expand attestation related queues#4058
paulhauner wants to merge 1 commit intosigp:unstablefrom
paulhauner:expand-attn-queues

Conversation

@paulhauner
Copy link
Copy Markdown
Member

@paulhauner paulhauner commented Mar 7, 2023

Issue Addressed

NA

Proposed Changes

We've used 16,384 as the maximum length for a bunch of queues since we didn't really expect to see more than that many attestations per slot.

We would see 16,384 attestations per slot once we reach 16384 * 32 = 524288 validators. We're now at ~542k 😬

Having small queues is not a big deal, since the queues only fill up if we're not processing attestations. However, it does result in a bunch of ERRO Failed to send scheduled attestation when we see a block later than the rest of the network.

"Why 32,768?", you may ask. It's the next power-of-two and it seems like a decent long-shot for the validator count in the future. I'm open to other suggestions.

Additional Info

I also ended up increasing the max size of the naive aggregation pool. I don't think this will have any effect since we don't actually expect to hit these limits. I just though it would be nice to be consistent.

@paulhauner paulhauner added work-in-progress PR is a work-in-progress v4.0.0 Mainnet Capella release expected late March 2023 labels Mar 7, 2023
@paulhauner
Copy link
Copy Markdown
Member Author

paulhauner commented Mar 13, 2023

By my math, a queue of 32,768 full of attestations will be 9MiB. So, I don't think we're creating any significant DoS risk by expanding this queue.

Workings:

>>> attestation_data = 8 + 8 + 32 + 32 + 8 + 32 + 8
>>> sig = 96
>>> bitlist = bitlist = 32768 / 64 / 8  # NUM_VALIDATORS_PER_SLOT / MAX_COMMITTEES_PER_SLOT / BITS_PER_BYTE
>>> attestation = bitlist + attestation_data + sig
>>> full_queue_of_attestations = attestation * 32768
>>> full_queue_of_attestations / 1024 / 1024
9.0

@paulhauner paulhauner added ready-for-review The code is ready for review and removed work-in-progress PR is a work-in-progress labels Mar 13, 2023
@paulhauner paulhauner marked this pull request as ready for review March 13, 2023 04:47
Copy link
Copy Markdown
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good on the whole, but I'm not sure how to decide on the minutiae of the different queue sizes

const MAX_UNAGGREGATED_ATTESTATION_REPROCESS_QUEUE_LEN: usize = 8_192;
/// The maximum number of queued `Attestation` objects that reference an unknown
/// block that will be stored before we start dropping them.
const MAX_UNAGGREGATED_ATTESTATION_REPROCESS_QUEUE_LEN: usize = 32_768;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we should keep the reprocess queue limit at half the regular limit, i.e. only bump it to 16K.

I'd like to have some good basis for this decision (queueing theory?) but I don't, other than "if it ain't broke, don't fix it".


/// The maximum size of the channel for re-processing work events.
const MAX_SCHEDULED_WORK_QUEUE_LEN: usize = 3 * MAX_WORK_EVENT_QUEUE_LEN / 4;
const MAX_SCHEDULED_WORK_QUEUE_LEN: usize = MAX_WORK_EVENT_QUEUE_LEN;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, I wonder if we should keep this at 3/4?

@paulhauner
Copy link
Copy Markdown
Member Author

Thanks for the comments @michaelsproul. After reading those, looking back at #3239 and doing some more thinking, I'm going to close this PR in favor of #4101.

Expanding the queues comes with some risk and I'm not sure that v4.0.0 is a release that needs any more risk. With #4101 we should be able to gather more information and make a better informed decision in the future, with very little v4.0.0 risk ☺️

@paulhauner paulhauner closed this Mar 17, 2023
@paulhauner paulhauner removed ready-for-review The code is ready for review v4.0.0 Mainnet Capella release expected late March 2023 labels Mar 17, 2023
bors bot pushed a commit that referenced this pull request Mar 21, 2023
## Issue Addressed

NA

## Proposed Changes

Replaces #4058 to attempt to reduce `ERRO Failed to send scheduled attestation` spam and provide more information for diagnosis. With this PR we achieve:

- When dequeuing attestations after a block is received, send only one log which reports `n` failures (rather than `n` logs reporting `n` failures).
- Make a distinction in logs between two separate attestation dequeuing events.
- Add more information to both log events to help assist with troubleshooting.

## Additional Info

NA
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
## Issue Addressed

NA

## Proposed Changes

Replaces sigp#4058 to attempt to reduce `ERRO Failed to send scheduled attestation` spam and provide more information for diagnosis. With this PR we achieve:

- When dequeuing attestations after a block is received, send only one log which reports `n` failures (rather than `n` logs reporting `n` failures).
- Make a distinction in logs between two separate attestation dequeuing events.
- Add more information to both log events to help assist with troubleshooting.

## Additional Info

NA
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants